A Unified Approach to Speculative Parallelization of Loops in DSM Multiprocessors
Authors
Abstract
Speculative parallel execution of statically non-analyzable codes on Distributed Shared-Memory (DSM) multiprocessors is challenging because of the long latencies and memory distribution involved. However, such an approach may well be the best way of speeding up codes whose dependences cannot be analyzed by the compiler. In this paper, we extend past work by proposing a hardware scheme for the speculative parallel execution of loops that have a modest number of cross-iteration dependences. When a dependence violation is detected, we repair the state locally and then, depending on the situation, either re-execute one out-of-order iteration or restart parallel execution from that point on. The general algorithm, called the Unified Privatization and Reduction algorithm (UPAR), privatizes on demand at cache-line granularity, executes reductions in parallel, and merges last values and partial reduction results on the fly, leaving minimal residual work at the end of the loop. UPAR allows fully dynamic scheduling and does not slow down when the working set of an iteration exceeds the cache size. Simulations indicate good speedups relative to sequential execution. The hardware support for reduction optimizations brings, on average, a 50% performance improvement and can be used in both speculative and normal execution.
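The detect-violation, repair-locally, re-execute policy described in the abstract can be sketched in software. The following is an illustrative emulation only, not the paper's hardware design; the function `run_speculative`, the address-keyed memory, and the toy loop body are all hypothetical names introduced here for demonstration:

```python
from collections import defaultdict

def run_speculative(body, mem, order):
    """Emulate one speculative pass over loop iterations executed in the
    out-of-order schedule `order`, logging reads and writes per address.
    Afterwards, detect cross-iteration violations (an iteration read a
    location before a logically earlier iteration wrote it) and re-execute
    the violated iterations in program order -- a software analogue of the
    "repair locally, then re-execute" policy. This sketch does a single
    repair pass; longer dependence chains would need the paper's
    restart-from-the-violation-point fallback.
    """
    write_slot = {}                  # addr -> (logical iter, schedule slot) of last write
    read_slots = defaultdict(list)   # addr -> [(logical iter, schedule slot)]

    def tracked_ops(it, slot):
        def load(addr):
            read_slots[addr].append((it, slot))
            return mem[addr]
        def store(addr, val):
            write_slot[addr] = (it, slot)
            mem[addr] = val
        return load, store

    for slot, it in enumerate(order):        # speculative, out-of-order pass
        body(it, *tracked_ops(it, slot))

    violated = set()
    for addr, readers in read_slots.items():
        if addr in write_slot:
            wi, wslot = write_slot[addr]
            for ri, rslot in readers:
                # Iteration `ri` read `addr` before the logically earlier
                # iteration `wi` produced it: a stale value was consumed.
                if wi < ri and wslot > rslot:
                    violated.add(ri)

    for it in sorted(violated):              # local repair: re-run in order
        body(it, mem.__getitem__, mem.__setitem__)
    return mem

# Toy loop with one cross-iteration dependence:
# iteration 2 produces 'y', iteration 3 consumes it.
def body(it, load, store):
    if it == 2:
        store('y', 10)
    elif it == 3:
        store('x', load('y') + 1)

# Schedule iteration 3 first so it reads 'y' too early, then gets repaired.
mem = run_speculative(body, {'x': 0, 'y': 0}, order=[3, 0, 1, 2])
print(mem['x'], mem['y'])
```

Running iteration 3 before iteration 2 makes it consume a stale value of 'y'; the post-pass check flags it and the sequential re-execution restores the result that in-order execution would have produced.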
Similar Papers
Hardware for Speculative Run-Time Parallelization in Distributed Shared-Memory Multiprocessors
Run-time parallelization is often the only way to execute the code in parallel when data dependence information is incomplete at compile time. This situation is common in many important applications. Unfortunately, known techniques for run-time parallelization are often computationally expensive or not general enough. To address this problem, we propose new hardware support for efficient run-time...
Speculative Parallel Execution of Loops with Cross-Iteration Dependences in DSM Multiprocessors
Speculative parallel execution of non-analyzable codes on Distributed Shared-Memory (DSM) multiprocessors is challenging due to the long latencies and distribution involved. However, such an approach may well be the best way of speeding up codes whose dependences cannot be analyzed by the compiler. In previous work, we suggested executing the loop speculatively in parallel and adding extensions to the...
A Feasibility Study of Hardware Speculative Parallelization in Snoop-Based Multiprocessors
Run-time parallelization is a technique for parallelizing programs with data access patterns difficult to analyze at compile time. In this paper we examine the hardware implementation of a run-time parallelization scheme, called speculative parallelization, on snoop-based multiprocessors. The implementation is based on the idea of embedding dependence checking logic into the cache controller o...
When All Else Fails, Guess: The Use of Speculative Multithreading for High-Performance Computing
Fundamental physical limits are being encountered in the design of integrated circuits that will limit future increases in processor clock rates. As a result, computer architects are developing aggressive new mechanisms to execute instructions speculatively, that is, before it is known whether or not they should actually be executed, and even before the input values needed by the instructions...
Techniques for Module-Level Speculative Parallelization on Shared-Memory Multiprocessors: Research Proposal
Multiprocessors have hit the mainstream and cover the whole spectrum of computational needs, from small-scale symmetric multiprocessors to scalable distributed shared-memory systems with a few hundred processors. This has made it possible to boost the performance of a number of important applications from the numeric and database domains. Extending the scope of applications that can take advantag...